Parallel K-Means Algorithm for Shared Memory Multiprocessors

نویسنده

  • Tayfun Kucukyilmaz
چکیده

Clustering is the task of assigning a set of instances into groups in such a way that is dissimilarity of instances within each group is minimized. Clustering is widely used in several areas such as data mining, pattern recognition, machine learning, image processing, computer vision and etc. K-means is a popular clustering algorithm which partitions instances into a fixed number clusters in an iterative fashion. Although k-means is considered to be a poor clustering algorithm in terms of result quality, due to its simplicity, speed on practical applications, and iterative nature it is selected as one of the top 10 algorithms in data mining [1]. Parallelization of k-means is also studied during the last 2 decades. Most of these work concentrate on shared-nothing architectures. With the advent of current technological advances on GPU technology, implementation of the k-means algorithm on shared memory architectures recently start to attract some attention. However, to the best of our knowledge, no in-depth analysis on the performance of k-means on shared memory multiprocessors is done in the literature. In this work, our aim is to fill this gap by providing theoretical analysis on the performance of k-means algorithm and presenting extensive tests on a shared memory architecture.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Performance Characterization of Shared- and Distributed-Memory Multiprocessors on a Tree Search Problem

In this paper, we measure and compare the performance of sharedand distributed-memory multiprocessors using a parallel tree search problem to characterize these types of multiprocessors. We take the knapsack problem using the branch-and-bound algorithm as our workload. It is di cult to compare the performance using irregular parallel problems such as tree search problems because the parallelism...

متن کامل

Executing Nested Parallel Loops on Shared-Memory Multiprocessors

Cache-coherent, bus-based shared-memory multiprocessors are a cost-e ective platform for parallel processing. In scienti c parallel applications, most of the computation involves processing of large multidimensional data structures which results in a high degree of data parallelism. This parallelism can be exploited in the form of nested parallel loops. Most existing shared memory multiprocesso...

متن کامل

Automatic Localization for Distributed-Memory Multiprocessors Using a Shared-Memory Compilation Framework

In this paper, we outline an approach for compiling for distributed-memory multiprocessors that is inherited from compiler technologies for shared-memory multiprocessors. We believe that this approach to compiling for distributed-memory machines is promising because it is a logical extension of the shared-memory parallel programming model, a model that is easier for programmers to work with, an...

متن کامل

A Scalable and Eecient Storage Allocator on Shared-memory Multiprocessors

An eecient dynamic storage allocator is important for time-critical parallel programs. In this paper, we present a fast and simple parallel allocator for xed size block on shared-memory multiprocessors. We show both theoretically and empirically that the allocator incurs very low lock contention. The allocator is tested with parallel simulation applications with frequent allocation and release ...

متن کامل

A Study on the Impact of Memory Consistency Models on Parallel Algorithms for Shared-Memory Multiprocessors

Memory consistency model is an integral part of the shared-memory multiprocessor system, and directly affects the performance. Most current multiprocessors adopt relaxed consistency models in quest of higher performance. In this paper we study the impact of memory consistency model on the design, implementation and performance of parallel algorithms for graph problems that remain challenging du...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014